SAVOR: Part I
Science des données Avec : Visualisation, Organisation et Reproductibilité (Data Science with: Visualization, Organization and Reproducibility) – Introduction
June 10, 2025
Modern R for Data Science
R started in 1993:
2025 - 1993 = 32 years ago
A lot has happened since!
- base R vs tidyverse
- R GUI vs RStudio
- R Markdown, and then Quarto
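To make the base vs tidyverse contrast concrete, here is a minimal sketch that computes the same summary both ways. It assumes the `dplyr` package is installed and uses the built-in `mtcars` dataset; the threshold `wt > 3` and the object names `base_result` and `tidy_result` are illustrative choices, not part of the course material.

```r
library(dplyr)

# Base R: subset the rows, then average mpg per number of cylinders
heavy <- mtcars[mtcars$wt > 3, ]
base_result <- aggregate(mpg ~ cyl, data = heavy, FUN = mean)

# tidyverse: the same steps, chained left to right with the pipe
tidy_result <- mtcars %>%
  filter(wt > 3) %>%
  group_by(cyl) %>%
  summarise(mpg = mean(mpg))
```

Both approaches give the same group means; the piped version reads in the order the steps are performed.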
Data Science
Data Science is an emerging field at the intersection of
statistics, computer science & data analysis
Figure source: R for Data Science (2e), Wickham et al.
Course objectives
- be able to successfully import and transform data in R (%>% & dplyr)
- be able to choose and implement suitable and beautiful data visualizations (ggplot2)
- be able to have a reproducible workflow through dynamic reporting
- understand the differences and commonalities between:
  - software development
  - data analysis
Course organization
Parts:
- (brief) recap on R basics
- Dynamic reproducible reporting with Quarto
- Data manipulation with dplyr
- Data visualization with ggplot2
In each part:
- some key theoretical concepts
- practical exercises to develop your skills and your autonomy
General advice
⇒ use function()
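One reading of this advice is to wrap code you would otherwise copy and paste into a function. A minimal sketch in base R, where the helper name `rescale01` is a hypothetical example and not part of the course material:

```r
# Hypothetical helper: rescale a numeric vector to the [0, 1] range
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)     # minimum and maximum, ignoring NA
  (x - rng[1]) / (rng[2] - rng[1])  # shift to 0, then divide by the span
}

rescaled <- rescale01(c(2, 4, 6))
```

Once the logic lives in one place, a fix or a change only needs to be made once.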
brush-up
RStudio: use up-to-date, modern tools
use RStudio projects – always!
⇒ live demo
Brush up practical
open SAVOR_practical1.html and follow along…
tidyverse
Tidy + Universe
Hadley Wickham 🫶
tidy data
- each column represents a different variable
- each row represents one observation
- different observation types are stored in different tables (i.e. data.frame)
⇒ tidyverse: a collection of packages for working with/analyzing tidy data
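As an illustration of the tidy data rules, the sketch below reshapes a small made-up table so that each column is a variable and each row is one observation. It assumes the `tidyr` package is installed; the data and the column names `country`, `year` and `cases` are invented for the example.

```r
library(tidyr)

# Hypothetical untidy table: the variable "year" is spread across column names
untidy <- data.frame(
  country = c("A", "B"),
  `2023` = c(10, 20),
  `2024` = c(12, 25),
  check.names = FALSE  # keep "2023" and "2024" as column names
)

# Tidy version: one row per country and year, one column per variable
tidy <- pivot_longer(
  untidy,
  cols = -country,
  names_to = "year",
  values_to = "cases"
)
```

The tidy table has four rows (two countries times two years) and three columns, which is the shape dplyr and ggplot2 expect.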
Other resources
- Posit cheat sheets and webinars
- R for Data Science (2e), Wickham, Çetinkaya-Rundel & Grolemund
- What They Forgot to Teach You About R, Bryan & Hester
- many more …